Abstract
Double-hit diffuse large B-cell lymphoma (DLBCL) and Burkitt lymphoma (BL) are currently classified as dark zone lymphoma (DZL) because the tumor cells are believed to originate from the densely packed B cells of the germinal center dark zone. These lymphomas are associated with poor outcomes and require more aggressive treatment protocols. DZLs have been shown to have specific expression profiles that distinguish them from other types of DLBCL. The prognosis of double-hit (DH) DLBCL and BL has improved significantly with the use of dose-intensive regimens. However, non-BL, non-DH DZL is not well studied. We used targeted transcriptomic data with artificial intelligence (AI) to first establish an AI-defined DZL transcriptomic signature. We then used this AI model to test 187 DLBCL cases without MYC rearrangement (DLBCLn) for the presence or absence of this signature.
RNA was extracted from lymph node samples of 363 cases of high-grade lymphoma: 40 cases of double-hit lymphoma, 19 of Burkitt lymphoma, and 304 of DLBCLn. The RNA was sequenced by next-generation sequencing (NGS) using a targeted RNA panel of 1600 genes. Libraries were prepared by hybrid capture, and RNA expression was quantified as transcripts per million (TPM). One set, comprising the 59 double-hit and Burkitt cases plus 117 DLBCLn cases, was used to establish the DZ signature, and an independent set of 187 DLBCLn cases was used for testing. Bayesian statistics were used to rank the genes that distinguish DZL from DLBCLn, and eXtreme Gradient Boosting (XGBoost) was then used to establish the DZL signature. Two thirds of the first set were used to train the model and one third to test it. A score combining the relevant genes was derived, with a cut-off point that distinguishes DZL from DLBCLn. The same Bayesian/XGBoost algorithm was then applied to the remaining 187 DLBCLn cases to stratify them as DZ signature positive or negative.
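The two generic steps named above, TPM quantification and the two-thirds/one-third split, can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the function names, the toy read counts, the toy transcript lengths, and the choice to stratify the split by class are all assumptions made for the example.

```python
import random


def tpm(counts, lengths_kb):
    """Convert raw read counts to transcripts per million (TPM).

    counts and lengths_kb are dicts keyed by gene; lengths are in kilobases.
    """
    # Reads per kilobase: normalizes each gene for transcript length.
    rpk = {gene: counts[gene] / lengths_kb[gene] for gene in counts}
    total = sum(rpk.values())
    if total == 0:
        return {gene: 0.0 for gene in counts}
    # Rescale so that the values within one sample sum to one million.
    return {gene: value / total * 1e6 for gene, value in rpk.items()}


def stratified_split(labels, train_fraction=2 / 3, seed=0):
    """Return (train_indices, test_indices), splitting each class separately.

    Stratifying by class is an assumption of this sketch; the abstract only
    states that two thirds were used for training and one third for testing.
    """
    rng = random.Random(seed)
    train, test = [], []
    for cls in sorted(set(labels)):
        idx = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(idx)
        n_train = round(len(idx) * train_fraction)
        train.extend(idx[:n_train])
        test.extend(idx[n_train:])
    return sorted(train), sorted(test)


# Toy values for illustration only (not real counts or transcript lengths).
toy_counts = {"MYC": 300, "TCF3": 100, "MEN1": 100}
toy_lengths_kb = {"MYC": 3.0, "TCF3": 1.0, "MEN1": 2.0}
toy_tpm = tpm(toy_counts, toy_lengths_kb)

# Class sizes taken from the abstract: 59 DZL and 117 DLBCLn cases.
labels = ["DZL"] * 59 + ["DLBCLn"] * 117
train_idx, test_idx = stratified_split(labels)
```

With the class sizes reported here, a stratified two-thirds split assigns 117 cases to training and 59 to testing while keeping the DZL-to-DLBCLn ratio roughly equal in both parts.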
Using 59 DZL cases and 117 DLBCLn cases in the Bayesian/XGBoost model described above, we show that in the test set DZL can be distinguished from DLBCLn with an AUC of 0.927 (95% CI: 0.862-0.993) using only 6 genes (MYC, SMARCA4, PATZ1, MEN1, TCF3, MSI2). Using 50 genes raised the AUC to 0.962 (95% CI: 0.916-1.00). To increase stringency, we used the 50-gene AI model to test the remaining 187 DLBCLn cases for the presence of the DZ signature. Of these cases, 10 (5%) showed a biologically DZ transcriptomic signature. These cases showed significantly higher levels of MYC (P<0.0001), MIB1 (P<0.002), SMARCA4 (P=0.0001), MEN1 (P=0.0001), and BCL6 (P=0.006). In contrast, there was no significant difference in BCL2 mRNA levels between the two groups, and BCL2 was not selected by the algorithm for the signature. Of the 10 cases, 8 (80%) were classified as GCB and 2 as non-GCB. Most of these cases (80%) had overexpression of MYC due to MYC gene copy number gain or amplification.
Conclusions: These data show that targeted RNA transcriptomic data, when combined with AI, can define a unique transcriptomic signature that identifies DZ lymphoma. The data also show that 14% of DZL cases are missed if diagnosis is based strictly on FISH classification. Using the RNA expression of 50 genes in an AI model provides a reliable approach to identifying these DLBCL patients. Future analysis of non-BL/non-DH DZL outcomes is planned and will include stratification by therapy type.
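For readers unfamiliar with the reported metric, the sketch below shows one common way to compute an AUC with a 95% confidence interval from per-case model scores. The abstract does not state how its intervals were derived, so the percentile bootstrap used here is an assumption, and the function names and toy scores are hypothetical.

```python
import random


def auc(scores, labels):
    """Area under the ROC curve via the Mann-Whitney pairwise statistic.

    labels are 1 for the positive class (e.g. DZL) and 0 for the negative.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes are required to compute an AUC")
    # Count positive/negative pairs ranked correctly; ties count as half.
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def bootstrap_ci(scores, labels, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap confidence interval for the AUC."""
    rng = random.Random(seed)
    n = len(scores)
    stats = []
    while len(stats) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [labels[i] for i in idx]
        # Skip resamples that happen to contain only one class.
        if 0 < sum(ys) < n:
            stats.append(auc([scores[i] for i in idx], ys))
    stats.sort()
    lower = stats[int((alpha / 2) * n_boot)]
    upper = stats[int((1 - alpha / 2) * n_boot) - 1]
    return lower, upper


# Toy scores for illustration only (not data from the study).
toy_scores = [0.9, 0.8, 0.7, 0.4, 0.6, 0.3, 0.2, 0.1]
toy_labels = [1, 1, 1, 1, 0, 0, 0, 0]
toy_auc = auc(toy_scores, toy_labels)
ci_low, ci_high = bootstrap_ci(toy_scores, toy_labels)
```

An upper bound of exactly 1.00, as reported for the 50-gene model, is typical when many resamples separate the two classes perfectly.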